Claims

What is claimed is: 

1. A data analyzer for use with a pattern classifier to compress a set of indexed data,
comprising a data removal module for identifying and removing portions of the set of 
indexed data having insufficient discriminatory power based on the ensemble 
statistics of the set of indexed data. 

2. The data analyzer according to claim 1, wherein the data removal module 
comprises a common characteristic removal module comprising means for 
identifying and removing common characteristics of the set of indexed data based on 
the ensemble statistics of the set of indexed data.

3. The data analyzer according to claim 1, wherein the data removal module 
comprises a noise removal module comprising means for identifying and removing 
noise portions of the set of indexed data based on ensemble statistics of the set of 
indexed data. 

4. The data analyzer according to claim 3, wherein the data removal module 
comprises a common characteristic removal module comprising means for 
identifying and removing common characteristics of the set of indexed data based on 
the ensemble statistics of the set of indexed data. 

5. The data analyzer according to claim 4, comprising a normalization means for 
normalizing the indexed data. 

6. The data analyzer according to claim 5, wherein the normalization means is 
configured to process the indexed data prior to processing by the common 
characteristic removal module. 

7. The data analyzer according to claim 5, wherein the normalization means is
configured to process the indexed data after processing by the common characteristic 
removal module. 

8. The data analyzer according to claim 5, wherein the normalization means 
comprises means for normalizing a member of the set to the standard deviation of the 
member. 

9. The data analyzer according to claim 5, wherein the normalization means 
comprises means for normalizing a member of the set to the maximum value of the 
member. 
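
The two normalization options recited in claims 8 and 9 can be illustrated with a minimal sketch. The function names are hypothetical, each member is assumed to be a plain list of numbers, and the population standard deviation is assumed because the claims do not say which estimator is intended.

```python
from statistics import pstdev

def normalize_to_std(member):
    """Scale one member of the set by its own standard deviation.

    Assumption: population standard deviation (pstdev).
    """
    sd = pstdev(member)
    if sd == 0:
        raise ValueError("member has zero standard deviation")
    return [value / sd for value in member]

def normalize_to_max(member):
    """Scale one member of the set by its own maximum value."""
    peak = max(member)
    if peak == 0:
        raise ValueError("member has zero maximum value")
    return [value / peak for value in member]

# Illustrative data only, not taken from the specification.
spectrum = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
by_std = normalize_to_std(spectrum)
by_max = normalize_to_max(spectrum)
```

Either function is applied to each member independently, so it can run before or after the common characteristic removal, as claims 6 and 7 allow.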

10. The data analyzer according to claim 4, wherein the set of indexed data 
comprises indexed control-data and the common characteristic removal module
comprises means for analyzing the indexed control-data to identify the portions of the 
set of indexed data that contain common characteristics. 

11. The data analyzer according to claim 4, wherein the common characteristic
removal module comprises a threshold means for identifying the portions of the 
indexed data that contain common characteristics. 

12. The data analyzer according to claim 11, wherein the threshold means calculates 
the threshold relative to an ensemble statistic of the set of indexed data. 

13. The data analyzer according to claim 11, wherein the threshold means comprises 
means for removing an index from the indexed data having an ensemble variance 
higher than the threshold value. 

14. The data analyzer according to claim 4, wherein the noise removal module 
comprises a threshold means for identifying the portions of the indexed data that 
contain noise. 

15. The data analyzer according to claim 14, wherein the threshold means calculates
the threshold relative to the ensemble variance of the set of indexed data. 

16. The data analyzer according to claim 14, wherein the threshold means comprises 
means for removing an index from the indexed data having an ensemble variance 
lower than the threshold value. 
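
The threshold means of claims 11 to 16 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, every member is assumed to share the same index grid, and both thresholds are assumed to be fractions of the largest ensemble variance, which is one possible reading of "relative to" an ensemble statistic.

```python
from statistics import pvariance

def ensemble_variance(members):
    """Variance across all members at each index (population variance assumed)."""
    return [pvariance(column) for column in zip(*members)]

def retained_indices(variances, common_threshold, noise_threshold):
    """Keep indices whose ensemble variance is neither above the common
    characteristic threshold nor below the noise threshold."""
    return [i for i, v in enumerate(variances)
            if noise_threshold <= v <= common_threshold]

def compress(members, kept):
    """Drop every index not in `kept` from each member."""
    return [[member[i] for i in kept] for member in members]

# Illustrative data only: index 1 varies strongly, indices 0 and 3 barely vary.
members = [
    [1.0, 0.0, 5.0, 2.00],
    [1.0, 10.0, 6.0, 2.01],
    [1.0, 20.0, 7.0, 1.99],
]
variances = ensemble_variance(members)
largest = max(variances)
# Assumed fractions; the claims do not fix particular values.
kept = retained_indices(variances,
                        common_threshold=0.5 * largest,
                        noise_threshold=0.001 * largest)
compressed = compress(members, kept)
```

Removing an index shortens every member, which is the decrease in cardinality referred to in claims 17 to 20.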

17. The data analyzer according to claim 4, wherein the common characteristic 
removal module comprises means for decreasing the cardinality of the set of indexed 
data. 

18. The data analyzer according to claim 17, wherein the means for reducing 
cardinality comprises means for removing a portion of the data from a member of the 
indexed set. 

19. The data analyzer according to claim 4, wherein the noise removal module 
comprises means for decreasing the cardinality of the set of indexed data. 

20. The data analyzer according to claim 19, wherein the means for reducing 
cardinality comprises means for removing a portion of the data from a member of the 
indexed set. 

21. The data analyzer according to any one of claims 1-20, comprising: 

a feature extraction module for extracting a feature portion from the
compressed indexed data to provide a set of feature indexed data; and

a classification module for classifying the feature indexed data to provide 
pattern classification of the set of indexed data. 

22. A data analyzer for use with a pattern classifier to compress a set of indexed data 
having common characteristics and noise, comprising: 

a. means for determining a common characteristic threshold for the
indexed data set;

b. means for removing indices having ensemble statistics higher than the
common characteristic threshold value to provide a retained dataset;

c. means for calculating ensemble statistics of each retained index in the
retained dataset;

d. means for determining a noise threshold;

e. means for removing indices from the retained dataset having an
ensemble statistic lower than a noise threshold value; and

f. means for normalizing the indexed data.



23. The data analyzer according to claim 22, wherein the normalization means is 
configured to process the indexed data prior to processing by the common 
characteristic threshold means. 

24. The data analyzer according to claim 22, wherein the normalization means is 
configured to process the indexed data after processing by the common characteristic 
threshold means. 

25. The data analyzer according to claim 22, wherein the normalization means 
comprises means for normalizing a member of the set to the standard deviation of the 
member. 

26. The data analyzer according to claim 22, wherein the normalization means 
comprises means for normalizing a member of the set to the maximum value of the 
member. 

27. A method for filtering spectral data from a set of spectra to remove common 
characteristics and noise, comprising: 

identifying and removing the common characteristics from each spectrum
within the set of spectra based on the ensemble statistics of the set of
spectra, and

identifying and removing the noise portions of each spectrum based on
ensemble statistics of the set of spectra,

whereby a filtered set of spectra is provided in which the common characteristic and
noise portions have been removed.

28. A method for analyzing a set of indexed data to compress the set of data, 
comprising the steps of identifying and removing portions of the set of data having 
insufficient discriminatory power based on ensemble statistics of the set of indexed 
data, thereby providing a set of compressed indexed data. 

29. The method according to claim 28, wherein the step of identifying and removing 
portions of the set of data comprises identifying and removing common 
characteristics of the set of data based on ensemble statistics of the set of indexed 
data. 

30. The method according to claim 28, wherein the step of identifying and removing 
portions of the set of data comprises identifying and removing noise portions of the 
set of indexed data based on ensemble statistics of the set of indexed data. 

31. The method according to claim 30, wherein the step of identifying and removing
portions of the set of data comprises identifying and removing common 
characteristics of the set of data based on ensemble statistics of the set of indexed 
data. 

32. The method according to claim 31, comprising the step of normalizing the 
indexed data.

33. The method according to claim 32, wherein the step of normalizing is performed
prior to the step of identifying and removing common characteristics. 

34. The method according to claim 32, wherein the step of normalizing is performed 
after the step of identifying and removing common characteristics. 

35. The method according to claim 32, wherein the step of normalizing comprises 
normalizing a member of the set to the standard deviation of the member. 

36. The method according to claim 32, wherein the step of normalizing comprises 
normalizing a member of the set to the maximum value of the member. 

37. The method according to claim 32, wherein the set of indexed data comprises 
indexed control-data and the step of identifying and removing common 
characteristics comprises analyzing the indexed control-data to identify the portions 
of the set of indexed data that contain common characteristics. 

38. The method according to claim 32, wherein the step of identifying and removing 
common characteristics comprises identifying the portions of the indexed data that 
contain common characteristics based on comparison to a threshold value. 

39. The method according to claim 38, wherein the threshold is calculated based on 
an ensemble statistic of the set of indexed data. 

40. The method according to claim 38, wherein the step of identifying and removing 
common characteristics comprises removing an index from the indexed data having 
an ensemble variance higher than the threshold value. 

41. The method according to claim 31, wherein the step of identifying and removing 
the noise portions comprises identifying the portions of the indexed data that contain 
noise based on comparison to a threshold value. 

42. The method according to claim 41, wherein the threshold is calculated based on
an ensemble statistic of the set of indexed data. 

43. The method according to claim 41, wherein the step of identifying and removing 
the noise portions comprises removing an index from the indexed data having an 
ensemble variance lower than the threshold value. 

44. The method according to claim 31, wherein the step of identifying and removing 
common characteristics comprises decreasing the cardinality of the set of indexed 
data. 

45. The method according to claim 44, wherein reducing cardinality comprises 
removing a portion of the data from a member of the indexed set. 

46. The method according to claim 31, wherein the step of identifying and removing 
common characteristics comprises computing an ensemble variance of a set of 
control spectra. 

47. The method according to claim 31, wherein the step of identifying and removing
noise portions comprises computing an ensemble variance of a set of control spectra. 

48. The method according to claim 31, wherein the step of identifying and removing
the noise portions comprises decreasing the cardinality of the set of indexed data. 

49. The method according to claim 48, wherein reducing cardinality comprises 
removing a portion of the data from a member of the indexed set. 

50. The method according to any one of claims 1-49, comprising the steps of: 

extracting a feature portion from the compressed indexed data to provide a set 
of feature indexed data; and

classifying the feature indexed data to provide pattern classification of the set
of indexed data.

WO 2004/057524    PCT/US2003/040677

51. A method for classifying a set of indexed data which include a set of control 
spectra, comprising the steps of:

a. calculating an ensemble statistic at each index in the control spectra; 

b. identifying those indices at which the ensemble statistic exceeds a first 
selected threshold; 

c. removing the identified indices from all spectra in the set of indexed 
data to provide a set of compressed indexed data; 

d. calculating an ensemble statistic at each index of the compressed 
indexed data; 

e. removing all indices from each compressed spectrum that have an 
ensemble statistic that is lower than a second selected threshold value 
to provide a set of reduced indexed data; 

f. extracting a feature portion of each of the reduced indexed data to 
provide a set of feature spectra; and 

g. classifying the set of feature spectra into clusters. 

52. The method according to claim 51, wherein the step of calculating an ensemble 
statistic at each index in the control spectra comprises computing the ensemble 
variance of the control spectra. 

53. The method according to claim 51, wherein the step of calculating an ensemble 
statistic at each index of the compressed indexed data comprises computing the 
ensemble variance of the compressed indexed data. 
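
Steps a to g of claim 51, with the ensemble variance of claims 52 and 53 as the ensemble statistic, can be sketched end to end as below. The claims leave the feature extraction and the clustering method open, so two stand-ins are assumed purely for illustration: feature extraction keeps the `n_features` highest-variance indices, and clustering is a small two-cluster k-means. All function names, thresholds, and data are hypothetical.

```python
from statistics import pvariance

def ensemble_variance(spectra):
    """Population variance across spectra at each index."""
    return [pvariance(column) for column in zip(*spectra)]

def select(spectra, indices):
    """Keep only the given indices in every spectrum."""
    return [[s[i] for i in indices] for s in spectra]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(points, iterations=10):
    """Stand-in clustering (assumption): k-means with k = 2 and a
    deterministic start (first point and the point farthest from it)."""
    centroids = [points[0], max(points, key=lambda p: sq_dist(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min((0, 1), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in (0, 1):
            group = [p for p, lab in zip(points, labels) if lab == c]
            if group:
                centroids[c] = [sum(col) / len(group) for col in zip(*group)]
    return labels

def classify(spectra, control, first_threshold, second_threshold, n_features):
    # Steps a-b: ensemble variance of the control spectra, flag high indices.
    control_var = ensemble_variance(control)
    kept = [i for i, v in enumerate(control_var) if not v > first_threshold]
    # Step c: remove the flagged indices from all spectra.
    compressed = select(spectra, kept)
    # Steps d-e: ensemble variance of the compressed data, drop low indices.
    comp_var = ensemble_variance(compressed)
    kept2 = [j for j, v in enumerate(comp_var) if v >= second_threshold]
    reduced = select(compressed, kept2)
    # Step f (assumed method): keep the n_features highest-variance indices.
    red_var = [comp_var[j] for j in kept2]
    ranked = sorted(range(len(red_var)), key=lambda j: red_var[j], reverse=True)
    features = select(reduced, sorted(ranked[:n_features]))
    # Step g (assumed method): cluster the feature spectra.
    return two_means(features)

# Illustrative data only; the control spectra are part of the full set.
control = [
    [0.0, 1.00, 1.0, 1.0, 0.5],
    [9.0, 1.01, 1.1, 0.9, 0.5],
]
samples = [
    [4.0, 1.00, 5.0, 1.0, 0.5],
    [5.0, 0.99, 5.2, 1.1, 0.5],
]
labels = classify(control + samples, control,
                  first_threshold=1.0, second_threshold=0.001, n_features=1)
```

In this toy data the first threshold removes index 0, which varies strongly within the controls, the second removes the nearly constant indices, and the remaining feature separates the control spectra from the samples.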